We study some properties of eigenvalue spectra of financial correlation matrices. In particular, we
investigate the nature of the large eigenvalue bulks which are observed empirically, and which have 
often been regarded as a consequence of the supposedly large amount of noise contained in financial 
data. We challenge this common knowledge by acting on the empirical correlation matrices of two 
data sets with a filtering procedure which highlights some of the cluster structure they contain, 
and we analyze the consequences of such filtering on eigenvalue spectra. We show that empirically 
observed eigenvalue bulks emerge as superpositions of smaller structures, which in turn emerge as 
a consequence of cross-correlations between stocks. We interpret and corroborate these findings in 
terms of factor models, and we compare empirical spectra to those predicted by Random Matrix
Theory for such models. 
I. INTRODUCTION



In Physics, Random Matrix Theory (RMT) is mainly used to model systems of particles interacting according to 
unknown laws. This is particularly handy for studying energy levels of complex systems such as heavy nuclei and 
mesoscopic systems. In such cases, the Hamiltonian operator can be conveniently described by a random matrix 
featuring some suitable symmetry properties. In particular, two matrix ensembles have been commonly used: the 
Gaussian Orthogonal Ensemble of real symmetric random matrices, and the Gaussian Unitary Ensemble of Hermitian 
random matrices [1], [2]. In both these cases, for proper normalization of matrix elements, the asymptotic statistical
properties of the eigenvalues follow the so-called semicircle law:

$$\rho(\lambda) = \frac{1}{2\pi}\sqrt{4 - \lambda^{2}}, \tag{1}$$

where $\rho(\lambda)$ is the marginal probability density function of the eigenvalues. Until recent years, physicists often neglected
the study of random correlation matrices, even though they find applications in very diverse fields ranging from biology
to econometrics. For this reason, applied mathematicians have studied such objects since the 1920s [3]. The asymptotic
eigenvalue statistics in this case are given by the Marcenko-Pastur distribution [4], which will be extensively discussed
in the following sections. Since the late 1990s, thanks to the growing interest in financial markets as prototypes of 
complex systems, physicists started working on random correlation matrices [HQ, and this will be the subject of this 
paper as well. 

We consider a set of $N$ stocks whose spot price at time $t$ we denote as $S_i(t)$, $i = 1, \dots, N$. Let $t_1, \dots, t_{T+1}$ be $T + 1$
equally spaced time instants; we then introduce the corresponding log-returns

$$r_i(t_j) = \log \frac{S_i(t_{j+1})}{S_i(t_j)}; \tag{2}$$

typically, one can think of the $t_j$ as days. This notation is a little redundant, and we can simply denote time steps
as $j = 1, \dots, T + 1$. Now, we can assume that the $T$ recorded log-return values are realizations of $N \times T$ random
variables $R^{(i)}_j$, so that we globally end up with $NT$ observations $r_{ij}$, $i = 1, \dots, N$, $j = 1, \dots, T$. Equivalently, the
vector

$$\mathbf{r}_i = (r_{i1}, \dots, r_{iT}) \tag{3}$$

containing all the observations of the $i$th asset returns, can be seen as a realization of a vector random variable $\mathbf{R}^{(i)}$.
Such a framework is fully characterized by finite probability distributions @, [1[ : 

$$P\left(R^{(1)}_1 \in A^{(1)}_1, \dots, R^{(1)}_T \in A^{(1)}_T; \dots; R^{(N)}_1 \in A^{(N)}_1, \dots, R^{(N)}_T \in A^{(N)}_T\right) = P\left(\mathbf{R}^{(1)} \in B^{(1)}; \dots; \mathbf{R}^{(N)} \in B^{(N)}\right) \tag{4}$$

where $A^{(i)}_j \subseteq \mathbb{R}$ $\forall i, j$ and $B^{(i)} \subseteq \mathbb{R}^{T}$ $\forall i$. Depending on the choice of the random variables $R^{(i)}_j$, such a picture allows for
a huge variety of possible descriptions of the stochastic dynamics of financial data. Most simply, one could adopt the standard
assumption that the log-returns are described by uncorrelated Gaussian processes ($(r_{1j}, \dots, r_{Nj}) \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_N)$,
where $\mathbf{I}_N$ represents the $N \times N$ identity matrix). However, as is well known, correlations often
play a major role, and a realistic description of financial markets should by no means neglect them. Still, a Gaussian
framework can be retained by observing that a set of zero-mean correlated Gaussian numbers generated by a
stationary stochastic process is completely characterized by its expectation value vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$:

$$\Sigma_{ij,kl} = \mathrm{E}[r_{ij} r_{kl}]. \tag{5}$$

Following [10], in this paper we shall simplify this structure by assuming that cross-correlations between assets
and auto-correlations in time factorize:

$$\Sigma_{ij,kl} = C_{ik} A_{jl}. \tag{6}$$

In the above equation $C_{ik}$ ($A_{jl}$) represents an element of an $N \times N$ ($T \times T$) positive-definite symmetric matrix $\mathbf{C}$ ($\mathbf{A}$).
We shall keep this same kind of notation, i.e. denoting matrices by bold letters and the corresponding matrix elements
by the same non-bold letters, throughout the rest of the paper. We shall assume the matrix $\mathbf{C}$ of cross-correlations to
be constant over time. Also, most importantly, we shall neglect all possible correlations in time by assuming $\mathbf{A} = \mathbf{I}_T$:

$$\Sigma_{ij,kl} = C_{ik} \delta_{jl}. \tag{7}$$

This last assumption is well motivated both from an empirical viewpoint [9] and a theoretical one, since asset returns
can be shown not to display auto-correlations whenever assets are assumed to be described by a sub-martingale. As 
a matter of fact, from the sub-martingale property one can show that 

E[r 4J r 4 , fe ] ~ E^n.fe] - (8) 

where 



_ Sj(t j+ i) - Sjjtj) 
Sib) 



assuming no dividend payment in the period and $S_i(t_{j+1}) - S_i(t_j) \ll S_i(t_j)$. This relation means that returns are
uncorrelated (not necessarily independent) random variables, as can be empirically verified Q. In the following, we 
shall always assume the previously mentioned condition ($S_i(t_{j+1}) - S_i(t_j) \ll S_i(t_j)$) to be fulfilled, thus allowing us to
identify log-returns and returns (as in equation (9)).

The Gaussian probability measure leading to the correlation structure (7) can be shown to be

$$P(\mathbf{R})\, D\mathbf{R} = \frac{1}{(2\pi)^{NT/2} (\det \mathbf{C})^{T/2}} \exp\left(-\frac{1}{2} \mathrm{Tr}\, \mathbf{R}^{T} \mathbf{C}^{-1} \mathbf{R}\right) D\mathbf{R}, \tag{10}$$

where $\mathbf{R}$ is a rectangular $N \times T$ matrix containing all of the return observations ($R_{ij} = r_{ij}$), while $D\mathbf{R} = \prod_{i=1}^{N} \prod_{j=1}^{T} dR_{ij}$ is the flat integration measure over matrix elements.
Being symmetric, the matrix $\mathbf{C}$ in (10) is made of $N(N + 1)/2$ independent entries. Now, the typical challenge to be
faced in many multivariate analysis problems is to estimate these numbers from $N$ time series of $T$ observations, i.e.
$NT$ data points. When such data are collected in an $N \times T$ matrix $\mathbf{R}$ as in (10), then a standard estimator for $\mathbf{C}$ is
given by the matrix $\mathbf{c} = \mathbf{R}\mathbf{R}^{T}/T$. In other words, an estimator for the matrix element $C_{ij}$ in $\mathbf{C}$ is given by

$$c_{ij} = \frac{1}{T} \sum_{t=1}^{T} R_{it} R_{jt}, \tag{11}$$

which is the well-known Pearson estimator for large $T$ (for small values of $T$ the $1/T$ factor would need to be replaced
by $1/(T - 1)$). Of course, the $c_{ij}$s are a noise-dressed representation of the $C_{ij}$s. As a matter of fact, even if the
random variables in $\mathbf{R}$ were exactly described by the probability distribution in (10) (i.e. by the correlation matrix
$\mathbf{C}$), the finiteness of the data sample under study would still cause the $c_{ij}$s to deviate, on average, from their
"true" counterparts, the $C_{ij}$s. As is intuitively clear, the two become closer as more observations are collected, i.e.
as $T \to \infty$, or equivalently as $q \to 0$, where $q$ is the so-called "rectangularity ratio":

$$q = \frac{N}{T}. \tag{12}$$

However, realistic situations in financial practice typically involve large numbers of variables and similarly large
numbers of observations. Ideally, this is not far from a "thermodynamic limit" situation in which

$$N, T \to \infty, \quad \text{with} \quad \frac{N}{T} = q = \text{constant}. \tag{13}$$

Remarkably, this is precisely the regime under which some powerful RMT results are valid 0|. In particular, this is 
the limit under which it is possible to make analytical statements about the relation between the eigenvalue spectra 
of the theoretical covariance matrix C and its estimator (fTT|) pH4l3l |. We shall exploit those results in the following 
sections. 
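As a minimal numerical sketch of the estimator (11) and of the ratio (12), assuming NumPy is available, the following Python snippet standardizes an $N \times T$ array of returns and computes $\mathbf{c} = \mathbf{R}\mathbf{R}^{T}/T$. The function name `pearson_estimator` and the synthetic Gaussian input are illustrative assumptions and are not part of the analysis carried out in this paper.

```python
import numpy as np

def pearson_estimator(returns):
    """Return (c, q) for an N x T array of returns, with c = R R^T / T."""
    R = np.asarray(returns, dtype=float)
    N, T = R.shape
    # Standardize each asset (row) to zero mean and unit standard deviation.
    R = (R - R.mean(axis=1, keepdims=True)) / R.std(axis=1, keepdims=True)
    c = R @ R.T / T          # equation (11) in matrix form
    return c, N / T          # q is the rectangularity ratio of equation (12)

# Hypothetical input: 50 uncorrelated synthetic series of 2000 observations.
rng = np.random.default_rng(0)
c, q = pearson_estimator(rng.standard_normal((50, 2000)))
```

With standardized rows the diagonal of `c` is exactly one, so `c` can be read directly as a sample correlation matrix.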

A very general class of models fulfilling (7) is that of the so-called factor models. Such models aim at
describing the time evolution of each asset in terms of a few "driving forces", or factors, which typically describe the
impact that a market sector or the whole market itself has on a given asset. In a $K$-factor model, the time evolution
of asset returns is given by

$$r_{it} = r_i(t) = \sum_{j=1}^{K} g_i^{(j)} m_j(t) + g_i^{(0)} \epsilon_i(t), \tag{14}$$

where $g_i^{(j)}$ and $g_i^{(0)}$ are constant parameters, whereas $m_j$ and $\epsilon_i$ are independent and identically distributed normal
random variables. We shall assume the latter to be normalized as follows:

$$\begin{aligned}
\mathrm{E}[m_i(t)] &= \mathrm{E}[\epsilon_i(t)] = 0 \\
\mathrm{E}[m_i(t) m_j(t')] &= \mathrm{E}[\epsilon_i(t) \epsilon_j(t')] = \delta_{ij} \delta_{tt'} \\
\mathrm{E}[m_i(t) \epsilon_j(t')] &= 0.
\end{aligned} \tag{15}$$

In the next section we shall specialize the model in (14) to a particular case. However, in a very general fashion,
factor models have proven to be able to reproduce, at least qualitatively, some relevant features of empirical covariance
matrix eigenvalue spectra.

The general appearance of the return covariance matrix eigenvalue spectrum of a given number of assets (for zero 
mean and unit standard deviation data) is the one depicted in Figure 1 for the log-returns of the daily prices for the
assets composing the S&P500 and FTSE350 Indices. Three main features are clearly visible: a large bulk close to 
zero, a number of larger eigenvalues "leaking out" of such bulk, and a much larger and isolated eigenvalue. Since the 
pioneering works [1, Q, RMT has become a standard tool to analyze these macroscopic features. More specifically, 
the aforementioned eigenvalue bulk has mostly been identified with the Marcenko-Pastur distribution [4], i.e. the
limiting eigenvalue marginal probability density for the (already introduced) matrix $\mathbf{c} = \mathbf{R}\mathbf{R}^{T}/T$ when all the entries
$R_{ij}$ are drawn from a normal distribution $\mathcal{N}(0, \sigma^2)$. Quite importantly, this result is rigorously derived only in the
thermodynamic limit (13) of matrix sizes growing to infinity at a fixed ratio. In this limit, the Marcenko-Pastur
distribution reads

$$\rho_c(\lambda) = \frac{1}{2\pi q \sigma^2} \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{\lambda}, \qquad \lambda_\pm = \sigma^2 (1 \pm \sqrt{q})^2, \tag{16}$$

where $q$ is the rectangularity ratio defined in (12). However, as can be seen in Figure 1 (b) and (d), the Marcenko-Pastur
distribution actually provides a very poor fit of empirical distributions when $q$ and $\sigma$ are assumed to be equal
to $N/T$ and 1 (for standardized data), respectively. The aforementioned eigenvalue bulks are reasonably well fitted
by a Marcenko-Pastur distribution only when $q$ and $\sigma$ are assumed to be free parameters, whose values are to be
determined via fitting. In particular, this typically causes $q$ to deviate from the ratio $N/T$, thus introducing the
concept of effective system size.

FIG. 1: (a) Empirical eigenvalue density of the covariance matrix for $T = 3400$ daily returns of $N = 396$ assets belonging
to the S&P500 Index over the years 1996-2009. (b) The same density as in (a) without the largest eigenvalue. (c) Eigenvalue
density for $T = 1423$ daily returns of $N = 243$ assets belonging to the FTSE350 Index over the years 2005-2010. (d) The
same density as in (c) without the largest eigenvalue. All figures were produced with standardized data. In (b) and (d) the
Marcenko-Pastur distributions for the corresponding values of $q = N/T$ are also plotted.

Since the Marcenko-Pastur distribution emerges as the limiting density for the covariance matrix of $N$ uncorrelated
time series made of $T$ observations, identifying eigenvalue bulks such as the ones in Figure 1 with it basically amounts
to stating that most of the information contained in empirical covariance matrix spectra is actually no information at
all, being equivalent to the spectrum one would obtain in the presence of pure noise [l~2l . |20| . On the other hand, this 
viewpoint allows one to give a specific meaning to the "large" eigenvalues out of the bulk. As could also be
verified with Principal Component Analysis (PCA) [21], such eigenvalues correspond to groups of correlated assets,
most typically belonging to the same market sector. Analogously, the largest eigenvalue of the distribution is usually
identified with the "market mode": such an eigenvalue appears as a consequence of those fluctuations that involve
the market as a whole, and as a matter of fact PCA can easily show it to account for a large part of the return
variance.
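The Marcenko-Pastur density that serves as the pure-noise benchmark in this discussion can be evaluated numerically as in the following minimal Python sketch (NumPy assumed). The function name `marchenko_pastur_pdf` is an illustrative choice; the values $N = 396$ and $T = 3400$ are those quoted in the caption of Figure 1 for the S&P500 data set, and the snippet only covers the case $0 < q \le 1$.

```python
import numpy as np

def marchenko_pastur_pdf(lam, q, sigma=1.0):
    """Marcenko-Pastur density for 0 < q <= 1; zero outside [lam_minus, lam_plus]."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    lam_minus = sigma**2 * (1.0 - np.sqrt(q))**2
    lam_plus = sigma**2 * (1.0 + np.sqrt(q))**2
    pdf = np.zeros_like(lam)
    inside = (lam > lam_minus) & (lam < lam_plus)
    x = lam[inside]
    pdf[inside] = np.sqrt((lam_plus - x) * (x - lam_minus)) / (2.0 * np.pi * q * sigma**2 * x)
    return pdf

# Pure-noise benchmark for standardized data (sigma = 1) with q = N / T.
q = 396 / 3400
grid = np.linspace(0.0, 3.0, 20001)
density = marchenko_pastur_pdf(grid, q)
mass = np.sum(density) * (grid[1] - grid[0])   # Riemann sum, close to one
```

Treating `q` and `sigma` as free parameters of this function, rather than fixing them to $N/T$ and 1, corresponds to the fitting practice that is questioned in the text.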

As already anticipated, factor models (14) represent good candidates to reproduce most of the empirical features shown
in Figure 1. In the following sections, we shall make use of such models to challenge the previously mentioned common
knowledge, according to which the eigenvalue bulks in empirical covariance matrix spectra essentially correspond to
noise. Such a common knowledge has already been revised critically in a number of works (see for example jl3l . [22| — 
Hf|), and in this paper we wish to present an additional amount of evidence in this direction. 

The paper is organized as follows. In Section II the "direct" problem of analytically estimating eigenvalue densities
is addressed. In particular, some specific versions of the factor model in equation (14) will be introduced and the
eigenvalue spectra for the correlation matrix $\mathbf{C}$ of such models will be derived (sometimes performing approximations).
Then, the RMT results provided in [12], [13] will be applied in order to derive exact results for the noise-dressed
version $\mathbf{c}$ of the correlation matrix. Finally, a subsection will be devoted to discussing the results obtained via Monte
Carlo simulations in order to validate those analytical results. In the light of such numerical results, we shall also
briefly discuss again the applicability limits of the Marcenko-Pastur distribution. In Section III, the "inverse" problem
of inferring eigenvalue densities from the empirically observed ones will be discussed. More specifically, a filtering
procedure will be devised in order to highlight some of the cluster structure in empirical correlation matrices. Such a
procedure will be performed on two data sets (relative to the S&P500 and the FTSE350 Indices), and the results will
be interpreted in terms of factor models. Finally, in Section IV some conclusions and possible future perspectives
of this work will be outlined.

II. THEORY: THE DIRECT PROBLEM 
A. Cluster models: heuristic analysis 



Let us now specialize the factor model (14). In particular, let us start from the situation where all asset returns
obey the following equation

$$r_i(t) = \gamma_N m_N(t) + (1 - \gamma_N) \epsilon_i(t), \tag{17}$$

where $m_N(t), \epsilon_i(t) \sim \mathcal{N}(0, 1)$ $\forall t$ and $\gamma_N \in [0, 1]$. In the previous equation $m_N$ represents a common mode driving all
assets with the same "intensity" $\gamma_N$. We shall now build $K$ clusters of correlated assets from equation (17). Thus,
let there be $K$ groups of $N_k$ variables ($k = 1, \dots, K$) with $\tilde{N} = \sum_{k=1}^{K} N_k < N$, and let us order the assets so that $r_i$
belongs to the $k$th group for $i = 1 + \sum_{l=1}^{k-1} N_l, \dots, \sum_{l=1}^{k} N_l$. We can also denote the generic element in the $k$th cluster
as $r_i^{(k)}$. We shall define it as

$$r_i^{(k)}(t) = \gamma_k m_k(t) + (1 - \gamma_k) r_i(t), \qquad i = 1 + \sum_{l=1}^{k-1} N_l, \dots, \sum_{l=1}^{k} N_l, \tag{18}$$

where $\gamma_k \in [0, 1]$, $m_k(t) \sim \mathcal{N}(0, 1)$ is a cluster mode and $r_i$ is as in equation (17). Thus, we can rewrite the previous
relation as

$$r_i^{(k)}(t) = \gamma_k m_k(t) + (1 - \gamma_k) \gamma_N m_N(t) + (1 - \gamma_k)(1 - \gamma_N) \epsilon_i(t), \qquad i = 1 + \sum_{l=1}^{k-1} N_l, \dots, \sum_{l=1}^{k} N_l. \tag{19}$$

We still simply call $r_i$ ($i = \tilde{N} + 1, \dots, N$) those elements which do not belong to any cluster, and we assume them to
evolve according to (17). We always have $\mathrm{E}[r_i(t)] = \mathrm{E}[r_j^{(k)}(t)] = 0$ $\forall i, j, k, t$. Recalling the relations in (15), which
can be generalized to include $m_N$ in a straightforward way, we can calculate all possible covariance matrix elements
between assets described by (17) and (19). Four separate cases can be distinguished:

$$\begin{aligned}
\mathrm{E}[r_i(t) r_j(t)] &= (1 - \gamma_N)^2 \delta_{ij} + \gamma_N^2 \\
\mathrm{E}[r_i(t) r_j^{(k)}(t)] &= (1 - \gamma_k) \gamma_N^2 \\
\mathrm{E}[r_i^{(k)}(t) r_j^{(k)}(t)] &= (1 - \gamma_k)^2 (1 - \gamma_N)^2 \delta_{ij} + (1 - \gamma_k)^2 \gamma_N^2 + \gamma_k^2 \\
\mathrm{E}[r_i^{(k)}(t) r_j^{(l)}(t)] &= (1 - \gamma_k)(1 - \gamma_l)(1 - \gamma_N)^2 \delta_{ij} + (1 - \gamma_k)(1 - \gamma_l) \gamma_N^2.
\end{aligned} \tag{20}$$
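The covariances in (20) can be checked against a direct simulation of the model. The following Python sketch (NumPy assumed) draws returns from equations (17) and (18) and compares sample covariances with three of the four cases in (20). The function name `simulate_cluster_model`, the small system size and the parameter values are illustrative assumptions chosen only to keep the example fast.

```python
import numpy as np

def simulate_cluster_model(N, T, cluster_sizes, gammas, gamma_N, rng):
    """Draw an N x T array of returns; clusters occupy the first rows."""
    eps = rng.standard_normal((N, T))
    m_N = rng.standard_normal(T)                        # common mode
    r = gamma_N * m_N + (1.0 - gamma_N) * eps           # equation (17)
    start = 0
    for N_k, g_k in zip(cluster_sizes, gammas):
        m_k = rng.standard_normal(T)                    # cluster mode
        block = slice(start, start + N_k)
        r[block] = g_k * m_k + (1.0 - g_k) * r[block]   # equation (18)
        start += N_k
    return r

g1, gN = 0.7, 0.3
rng = np.random.default_rng(1)
r = simulate_cluster_model(N=6, T=200000, cluster_sizes=[3], gammas=[g1], gamma_N=gN, rng=rng)
sample_cov = r @ r.T / r.shape[1]

# Off-diagonal predictions of equation (20).
within_cluster = (1 - g1)**2 * gN**2 + g1**2   # two distinct assets of the same cluster
cluster_to_free = (1 - g1) * gN**2             # cluster asset vs. non-cluster asset
free_to_free = gN**2                           # two distinct non-cluster assets
```

Here `sample_cov[0, 1]`, `sample_cov[0, 4]` and `sample_cov[4, 5]` estimate the three quantities above, up to statistical fluctuations of order $T^{-1/2}$.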
We are now in a position to compute the correlation matrix $\mathbf{C}$ of the model, whose matrix elements read

$$C_{ij} = \frac{\mathrm{E}[r_i r_j]}{\sqrt{\mathrm{Var}[r_i]\, \mathrm{Var}[r_j]}}, \tag{21}$$

with straightforward generalizations to those involving elements belonging to clusters.

We shall focus for now on the limiting case in which correlations between cluster elements are very strong, i.e. when
$\gamma_k \to 1$ in each cluster. One can see from (20) that under this assumption the model's correlation matrix has a simple
block-diagonal structure:

$$\mathbf{C} = \begin{pmatrix}
\mathbf{E}^{(N_1)} & & & & \\
 & \mathbf{E}^{(N_2)} & & & \\
 & & \ddots & & \\
 & & & \mathbf{E}^{(N_K)} & \\
 & & & & \mathbf{F}^{(N - \tilde{N})}
\end{pmatrix}, \tag{22}$$

where $\mathbf{E}^{(M)}$ is the $M \times M$ matrix whose entries are all equal to unity ($E^{(M)}_{ij} = 1$ $\forall i, j$), while $\mathbf{F}^{(N - \tilde{N})}$ is an $(N - \tilde{N}) \times (N - \tilde{N})$
matrix with a slightly more complicated structure:

$$\mathbf{F}^{(N - \tilde{N})} = \begin{pmatrix}
1 & \frac{\gamma_N^2}{(1 - \gamma_N)^2 + \gamma_N^2} & \cdots & \frac{\gamma_N^2}{(1 - \gamma_N)^2 + \gamma_N^2} \\
\frac{\gamma_N^2}{(1 - \gamma_N)^2 + \gamma_N^2} & 1 & \cdots & \frac{\gamma_N^2}{(1 - \gamma_N)^2 + \gamma_N^2} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\gamma_N^2}{(1 - \gamma_N)^2 + \gamma_N^2} & \frac{\gamma_N^2}{(1 - \gamma_N)^2 + \gamma_N^2} & \cdots & 1
\end{pmatrix}. \tag{23}$$

The block structure in (22) allows for the computation of the eigenvalue spectrum. In fact, since we have

$$\begin{aligned}
\det\left(\mathbf{E}^{(M)} - \Lambda \mathbf{I}_M\right) &= (-\Lambda)^{M - 1} (M - \Lambda) \\
\det\left(\mathbf{F}^{(N - \tilde{N})} - \Lambda \mathbf{I}_{N - \tilde{N}}\right) &= \left(\frac{(N - \tilde{N}) \gamma_N^2 + (1 - \gamma_N)^2}{(1 - \gamma_N)^2 + \gamma_N^2} - \Lambda\right) \left(\frac{(1 - \gamma_N)^2}{(1 - \gamma_N)^2 + \gamma_N^2} - \Lambda\right)^{N - \tilde{N} - 1},
\end{aligned} \tag{24}$$

the characteristic equation for the $\mathbf{C}$ matrix reads:

$$\Lambda^{\tilde{N} - K} \left(\Lambda - \frac{(N - \tilde{N}) \gamma_N^2 + (1 - \gamma_N)^2}{(1 - \gamma_N)^2 + \gamma_N^2}\right) \left(\Lambda - \frac{(1 - \gamma_N)^2}{(1 - \gamma_N)^2 + \gamma_N^2}\right)^{N - \tilde{N} - 1} \prod_{k=1}^{K} (\Lambda - N_k) = 0. \tag{25}$$

This eigenvalue spectrum is able to reproduce, at least on a heuristic level, some of the features of empirical spectra
(see Figure 1): each cluster gives rise to a large eigenvalue equal to the cluster size $N_k$, and the common mode produces
one large eigenvalue $\simeq (N - \tilde{N}) \gamma_N^2 / ((1 - \gamma_N)^2 + \gamma_N^2)$ too. It is worth mentioning that this latter eigenvalue might not
necessarily be the largest one: as a matter of fact, large enough $N_k$s and a small $\gamma_N$ can lead to situations in which the
largest eigenvalue is the largest cluster eigenvalue, $\max_k N_k$. Even though this seems not to be the case in most financial applications, it is
still worth stressing that the largest eigenvalue in empirical spectra should not be labelled as the "market eigenvalue"
right away, but only after some further checks (for example, the inspection of the corresponding eigenvector).
Going back to equation (23), an $(N - \tilde{N} - 1)$-fold degenerate eigenvalue (equal to $(1 - \gamma_N)^2 / ((1 - \gamma_N)^2 + \gamma_N^2)$) can
be recognized. Also, equation (25) indicates that each cluster gives rise to $N_k - 1$ zero modes, altogether forming a
group of $\tilde{N} - K$ zero modes. In a noise-marred situation, as can be verified by means of Monte Carlo simulations,
the degeneracies in (25) are broken and give rise to two bulks. In the highly correlated cluster assumption ($\gamma_k \to 1$)
yielding (25) the two bulks typically remain well separated. However, when such an assumption is relaxed, allowing for
small values of the $\gamma_k$s, the two bulks get closer, and for properly chosen values of the parameters they eventually
"collide" and merge into one single structure (see Subsection II C and the figures in it). The latter might be identified
with the typical eigenvalue bulks appearing in empirical spectra (see Figure 1). It is important to stress, already at this
heuristic level, that the emergence of such a bulk in this factor model stems from the presence of (weak) correlations
between the assets, as opposed to the Marcenko-Pastur distribution (16), which in turn, as already discussed, originates
from pure noise. Nevertheless, quite subtly, the Marcenko-Pastur distribution can still provide good fits to such bulks
in a number of situations, as we shall illustrate later.
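The spectrum of the strong-correlation matrix can be verified by direct diagonalization. The following Python sketch (NumPy assumed) builds the block-diagonal matrix made of all-ones cluster blocks and a common-mode block, and counts its eigenvalues. The function name `strong_correlation_matrix` and the parameter values ($N = 500$, two clusters of sizes 100 and 60, $\gamma_N = 0.3$) are illustrative assumptions; they are chosen so that the common-mode eigenvalue is not the largest one, as cautioned above.

```python
import numpy as np

def strong_correlation_matrix(N, cluster_sizes, gamma_N):
    """Block-diagonal correlation matrix in the limit gamma_k -> 1."""
    a = gamma_N**2 / ((1.0 - gamma_N)**2 + gamma_N**2)
    C = np.zeros((N, N))
    start = 0
    for N_k in cluster_sizes:
        C[start:start + N_k, start:start + N_k] = 1.0   # all-ones cluster block
        start += N_k
    C[start:, start:] = a                               # common-mode block, off-diagonal
    np.fill_diagonal(C, 1.0)
    return C

N, sizes, gamma_N = 500, [100, 60], 0.3
eigs = np.linalg.eigvalsh(strong_correlation_matrix(N, sizes, gamma_N))  # ascending order

D = (1.0 - gamma_N)**2 + gamma_N**2
n_free = N - sum(sizes)
degenerate = (1.0 - gamma_N)**2 / D                       # (n_free - 1)-fold eigenvalue
market = (n_free * gamma_N**2 + (1.0 - gamma_N)**2) / D   # common-mode eigenvalue
n_zero = int(np.sum(np.abs(eigs) < 1e-8))                 # zero modes
n_degenerate = int(np.sum(np.abs(eigs - degenerate) < 1e-8))
```

With these values the three largest eigenvalues are the two cluster sizes and the common-mode eigenvalue, the latter being the smallest of the three.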

The previous factor model, yielding equation (25) for the eigenvalue spectrum of its correlation matrix, can be further
simplified to the case where no common factor is driving the asset returns. This can be directly achieved on the
eigenvalue spectrum by setting $\gamma_N = 0$ in equation (25). This gives

$$\Lambda^{\tilde{N} - K} (\Lambda - 1)^{N - \tilde{N}} \prod_{k=1}^{K} (\Lambda - N_k) = 0. \tag{26}$$

This spectrum still yields one large eigenvalue for each cluster and two degenerate eigenvalues equal to zero and one,
respectively. Just like in the previous discussion, let us now relax the assumption of strong correlations ($\gamma_k \to 1$)
within clusters. For the sake of simplicity, let us assume that all assets in each cluster are mutually correlated with
the same correlation coefficient $\rho_k \in [0, 1]$ (which can be explicitly computed from (20)). So, the correlation matrix
of the model would read

$$\mathbf{C} = \begin{pmatrix}
\mathbf{E}^{(N_1)} & & & & \\
 & \mathbf{E}^{(N_2)} & & & \\
 & & \ddots & & \\
 & & & \mathbf{E}^{(N_K)} & \\
 & & & & \mathbf{I}_{N - \tilde{N}}
\end{pmatrix}, \tag{27}$$

where each $\mathbf{E}^{(N_k)}$ is now an $N_k \times N_k$ matrix such that $E^{(N_k)}_{ij} = \rho_k$ for $i \neq j$ and $E^{(N_k)}_{ii} = 1$, while the last block is now
given by the identity matrix (as can be seen from the first relation in (20) for $\gamma_N = 0$). One can verify that

$$\det\left(\mathbf{E}^{(N_k)} - \Lambda \mathbf{I}_{N_k}\right) = [(1 - \rho_k) - \Lambda]^{N_k - 1} [(N_k \rho_k + (1 - \rho_k)) - \Lambda]. \tag{28}$$

In order to further simplify things, let us consider the case where we have just one cluster of $\tilde{N}$ assets with mutual
correlation $\rho$. Equation (25) in this case would need to be modified to read

$$[\Lambda - (1 - \rho)]^{\tilde{N} - 1} (\Lambda - 1)^{N - \tilde{N}} [\Lambda - (\tilde{N} \rho + (1 - \rho))] = 0, \tag{29}$$

thus giving one large eigenvalue, and two degenerate eigenvalues equal to $(1 - \rho)$ and one. The latter emerges as a
consequence of the $N - \tilde{N}$ mutually uncorrelated assets, i.e. as a consequence of pure noise, while the former is due
to the presence of a cluster. Just like in the case discussed previously, a noise-dressed version of (29) would lead to
two eigenvalue bulks, and suitably chosen values of $\rho$ would make the two bulks merge into one (see Subsection II
C). Thus, in this case too, the emergence of a main bulk would not be a consequence of pure noise alone.



B. Cluster models: exact results 



The proper mathematical framework to deal with covariance matrices featuring degenerate spectra (such as the ones in
equations (25), (26) and (29)) is the one provided in [12], [13], and we shall exploit it extensively in the following. So,
first let us introduce some basic notions and notation of RMT. As we have done so far, we shall denote the eigenvalues
of the correlation matrix $\mathbf{C}$ of a given model as $\Lambda_i$ ($i = 1, \dots, N$), while the eigenvalues of the corresponding estimator
(11) will be denoted as $\lambda_i$. Quite straightforwardly, one can define the eigenvalue density for the theoretical correlation
matrix as

$$\rho_C(\Lambda) = \frac{1}{N} \sum_{i=1}^{N} \delta(\Lambda - \Lambda_i), \tag{30}$$

and this is related to the matrix moments $M_C^{(k)}$:

$$M_C^{(k)} = \frac{1}{N} \mathrm{Tr}\, \mathbf{C}^k = \frac{1}{N} \sum_{i=1}^{N} \Lambda_i^k = \int d\Lambda\, \rho_C(\Lambda) \Lambda^k. \tag{31}$$

In analogy to (30), one can define an expected spectral density for the estimator $\mathbf{c}$ in equation (11):

$$\rho_c(\lambda) = \frac{1}{N} \mathrm{E}\left[\sum_{i=1}^{N} \delta(\lambda - \lambda_i)\right], \tag{32}$$

where the expectation is meant with respect to the probability measure (10). Generalizing (31), we can then
define the expected matrix moments as

$$m_c^{(k)} = \frac{1}{N} \mathrm{E}\left[\mathrm{Tr}\, \mathbf{c}^k\right] = \int d\lambda\, \rho_c(\lambda) \lambda^k. \tag{33}$$

The two corresponding resolvents, or Green's functions, are given by:

$$\begin{aligned}
\mathbf{G}_C(Z) &= (Z \mathbf{I}_N - \mathbf{C})^{-1} \\
\mathbf{g}_c(z) &= \mathrm{E}\left[(z \mathbf{I}_N - \mathbf{c})^{-1}\right],
\end{aligned} \tag{34}$$

where $Z, z \in \mathbb{C}$. Then, one can introduce the moment generating functions, and it is possible to show that they are
closely related to the Green's functions in the following way

$$\begin{aligned}
M_C(Z) &= \sum_{k=1}^{\infty} \frac{M_C^{(k)}}{Z^k} = Z G_C(Z) - 1 \\
m_c(z) &= \sum_{k=1}^{\infty} \frac{m_c^{(k)}}{z^k} = z g_c(z) - 1,
\end{aligned} \tag{35}$$

where we have

$$G_C(Z) = \frac{1}{N} \mathrm{Tr}\, \mathbf{G}_C(Z), \qquad g_c(z) = \frac{1}{N} \mathrm{Tr}\, \mathbf{g}_c(z). \tag{36}$$

Moreover, from the well-known relation $\lim_{\epsilon \to 0^+} (\lambda + i\epsilon)^{-1} = \mathcal{P}(\lambda^{-1}) - i\pi\delta(\lambda)$ (where $\mathcal{P}$ denotes the principal value),
one can show that the eigenvalue densities (30) and (32) can be directly derived from the corresponding Green's
functions (34):

$$\begin{aligned}
\rho_C(\Lambda) &= -\frac{1}{\pi} \lim_{\epsilon \to 0^+} \mathrm{Im}\, G_C(\Lambda + i\epsilon) \\
\rho_c(\lambda) &= -\frac{1}{\pi} \lim_{\epsilon \to 0^+} \mathrm{Im}\, g_c(\lambda + i\epsilon).
\end{aligned} \tag{37}$$

So, basically, the Green's function contains the same information as the whole eigenvalue density, and the same,
through (35), is also true for the moment generating function. In particular, for $\Lambda, \lambda > 0$, the previous relations can
be converted into:

$$\begin{aligned}
\rho_C(\Lambda) &= -\frac{1}{\pi \Lambda} \lim_{\epsilon \to 0^+} \mathrm{Im}\, M_C(\Lambda + i\epsilon) \\
\rho_c(\lambda) &= -\frac{1}{\pi \lambda} \lim_{\epsilon \to 0^+} \mathrm{Im}\, m_c(\lambda + i\epsilon).
\end{aligned} \tag{38}$$
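The link between a Green's function and the eigenvalue density can be illustrated numerically at finite $\epsilon$, where each delta function is smoothed into a Lorentzian of width $\epsilon$. The following Python sketch (NumPy assumed) does so for a toy spectrum; the function name `density_from_resolvent`, the three eigenvalues and the value of $\epsilon$ are illustrative assumptions.

```python
import numpy as np

def density_from_resolvent(lam, eigenvalues, eps):
    """Approximate the density as -(1/pi) Im G(lam + i eps) at finite eps."""
    z = lam + 1j * eps
    G = np.mean(1.0 / (z - np.asarray(eigenvalues)))   # (1/N) Tr (z I - C)^{-1}
    return -G.imag / np.pi

# Toy spectrum (hypothetical values) and a grid wide enough to capture the tails.
eigenvalues = np.array([0.5, 1.0, 2.0])
grid = np.linspace(-20.0, 20.0, 40001)
rho = np.array([density_from_resolvent(x, eigenvalues, eps=0.05) for x in grid])
mass = np.sum(rho) * (grid[1] - grid[0])   # close to one, up to the Lorentzian tails
```

As $\epsilon$ decreases the smoothed density concentrates on the eigenvalues, which is the content of the limit $\epsilon \to 0^+$.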
A fundamental relation between the moment generating functions of a "true" correlation matrix and its
estimator in the infinite matrix size limit (13) can be derived [12], [13] either in the framework of Free Random
Variables or using planar diagrammatic methods [12], [28]. This derivation will be outlined in Appendix A.

The starting point is the following simple relation between moment generating functions

$$m_c(z) = M_C(Z), \tag{39}$$

where the two complex arguments are related by the following transformation:

$$Z = \frac{z}{1 + q\, m_c(z)}. \tag{40}$$

Once $M_C(Z)$ is known, $m_c(z)$ can be derived in principle from the following functional equation:

$$m_c(z) = M_C\left(\frac{z}{1 + q\, m_c(z)}\right). \tag{41}$$

Bearing in mind the previous discussion on factor models, we shall focus on correlation matrices whose spectra
display degenerate eigenvalues. Let us then assume the correlation matrix $\mathbf{C}$ to have $L$ distinct eigenvalues $\Lambda_i$
($i = 1, \dots, L$) with degeneracies $n_i$. The moment generating function for such a matrix is given by

$$M_C(Z) = \frac{1}{N} \sum_{i=1}^{L} \frac{n_i \Lambda_i}{Z - \Lambda_i} = \sum_{i=1}^{L} \frac{w_i \Lambda_i}{Z - \Lambda_i}, \tag{42}$$

where the weights $w_i = n_i / N$ have been introduced. Thus, from (41) we get

$$m_c(z) = \sum_{i=1}^{L} \frac{w_i \Lambda_i \left(1 + q\, m_c(z)\right)}{z - \Lambda_i \left(1 + q\, m_c(z)\right)}. \tag{43}$$

For each fixed $z$, this becomes a polynomial equation of degree $L + 1$ in $m_c(z)$, yielding as many solutions. The
problem arises of choosing the right one: as extensively discussed and detailed in [23], the right branch of the map in
equation (40) to pick is the one giving $Z \simeq z$ for $z \to \infty$. In the simplest case one has $\mathbf{C} = \mathbf{I}_N$ and, of course, the
correlation matrix has just one $N$-fold degenerate eigenvalue equal to one: it can be shown that, in this case, equation
(43) leads precisely to the Marcenko-Pastur distribution (16), as one would expect. On the other hand, already when
considering two distinct eigenvalues, quite different scenarios are possible, including the previously discussed cases of
well separated or merging bulks (see the next subsection). Let us also remark that equation (43) cannot be applied
to the large non-degenerate eigenvalues typically displayed by factor models. This is because, as already stated, we
shall always work in the thermodynamic limit (13), where the weight ($1/N$) of such eigenvalues vanishes as $N \to \infty$.
As a matter of fact, eigenvalues of this kind need to be investigated per se, and actually extensive areas of the RMT
literature are devoted to the study of statistical properties of single eigenvalues as well as order statistics [30]. In
particular, it has been shown in [31] that large non-degenerate sample eigenvalues of correlation matrices follow a
normal distribution (see the next subsection for a numerical confirmation).
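One possible way to solve the functional equation (41) numerically for a degenerate spectrum is sketched below in Python (NumPy assumed). Writing $u = 1 + q\, m_c(z)$, so that $Z = z/u$ as in (40), the equation becomes a polynomial of degree $L + 1$ in $u$ whose roots are found with `numpy.roots`. Selecting the root with $\mathrm{Im}\, Z > 0$ at small positive $\mathrm{Im}\, z$ is our reading of the branch prescription $Z \simeq z$ for large $z$; this selection rule, the function name `dressed_density`, and the restriction to strictly positive eigenvalues $\Lambda_i$ are assumptions of this sketch and not a reproduction of the procedure used to produce the results of this paper.

```python
import numpy as np

def dressed_density(lam, Lambdas, weights, q, eps=1e-8):
    """Sketch: bulk density of the estimator at lam > 0 for C with eigenvalues Lambdas > 0."""
    z = complex(lam, eps)
    # Linear factors (z - Lambda_i u), stored as polynomials in u (highest degree first).
    factors = [np.array([-L, z], dtype=complex) for L in Lambdas]
    full = np.array([1.0 + 0j])
    for f in factors:
        full = np.polymul(full, f)
    rhs = np.array([0.0 + 0j])
    for i, (L, w) in enumerate(zip(Lambdas, weights)):
        partial = np.array([w * L + 0j])
        for j, f in enumerate(factors):
            if j != i:
                partial = np.polymul(partial, f)
        rhs = np.polyadd(rhs, partial)
    # (u - 1)/q * prod_i (z - Lambda_i u) - u * sum_i w_i Lambda_i prod_{j != i} (z - Lambda_j u) = 0
    poly = np.polysub(np.polymul(np.array([1.0 / q, -1.0 / q]), full),
                      np.polymul(np.array([1.0, 0.0]), rhs))
    u = np.roots(poly)
    Z = z / u
    u_phys = u[np.argmax(Z.imag)]          # assumed physical branch: Im Z > 0
    m = (u_phys - 1.0) / q                 # m_c(z) on the selected branch
    return max(-m.imag / (np.pi * lam), 0.0)

# C = I_N should give back the Marcenko-Pastur density with sigma = 1.
mp_value = dressed_density(1.0, [1.0], [1.0], q=0.25)
# Two degenerate eigenvalues with weights 0.2 and 0.8 (illustrative values).
gap_value = dressed_density(0.25, [0.16, 1.0], [0.2, 0.8], q=0.25)
```

For $\mathbf{C} = \mathbf{I}_N$ the polynomial reduces to a quadratic whose solution is the Marcenko-Pastur density, which offers a simple consistency check of the branch selection.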



C. Monte Carlo simulations 



In this subsection we present and detail the Monte Carlo simulations we performed in order to test and validate 
the analytical results described so far. In all cases, we generated T realizations of N stochastic processes described 
by the factor model introduced in equations (fT5)) , (fH)|) and (I2TH (from a numerical viewpoint, this just boils down to 
the generation of standard Gaussian random numbers). By choosing different parameter values, we implemented the 
different versions of the model which were discussed in the previous subsection, corresponding to different theoretical 
correlation matrices (equations (22) and (27)). The eigenvalues of the corresponding estimators (11) were obtained via
numerical diagonalization (by means of the diagonalization algorithm provided by Matlab®).
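A single Monte Carlo realization of this kind can be sketched in a few lines of Python (NumPy assumed). The snippet below is not the Matlab code used for the results of this paper: it draws correlated Gaussian returns through a Cholesky factor of a one-cluster correlation matrix, rather than through the factor equations, and diagonalizes the resulting estimator. The function name `sample_spectrum`, the random seed and the threshold separating the two bulks are illustrative assumptions; the parameter values are those quoted in the caption of Figure 3 (a).

```python
import numpy as np

def sample_spectrum(C, T, rng):
    """Eigenvalues of c = R R^T / T for Gaussian data with correlation matrix C."""
    N = C.shape[0]
    chol = np.linalg.cholesky(C)              # C = chol @ chol.T
    R = chol @ rng.standard_normal((N, T))    # columns are correlated return vectors
    c = R @ R.T / T
    return np.linalg.eigvalsh(c)              # ascending order

# One cluster of 100 assets with mutual correlation 0.84 among N = 500 assets.
N, T, N_cluster, rho = 500, 2000, 100, 0.84
C = np.eye(N)
C[:N_cluster, :N_cluster] = rho
np.fill_diagonal(C, 1.0)

rng = np.random.default_rng(42)
eigs = sample_spectrum(C, T, rng)
left_bulk = eigs[eigs < 0.25]     # eigenvalues originating from the degenerate value 1 - rho
largest = eigs[-1]                # sample counterpart of N_cluster * rho + (1 - rho)
```

Repeating the call to `sample_spectrum` with independent random draws and pooling the eigenvalues gives histograms of the kind discussed in this subsection.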

In Figure 2 a first example of eigenvalue spectra deriving from factor models is presented. In this first example
a common mode (introduced via a non-zero $\gamma_N$ coefficient, see the figure caption for all the details on parameter
values) as well as a correlated cluster of variables are present. As already discussed in the previous subsection, the
degeneracies in equation (25) are broken, and, in the limit of strong correlations in the cluster ($\gamma_k \to 1$), two well
separated eigenvalue bulks emerge (Figure 2 (a)). On the other hand, when such correlations get weaker, the two
eigenvalue bulks get closer, eventually merging into one single structure (Figure 2 (b)). Remarkably, such a structure is
quite well fitted by a Marcenko-Pastur distribution, which is however characterized by values of the $q$ and $\sigma$ parameters
that differ from the ones which would be obtained for standardized uncorrelated data ($q = N/T$ and $\sigma = 1$).

FIG. 2: (a) Eigenvalue density for 100 simulations of the factor model described in (fTH]>. (fT9l) and ([H with $N = 500$, $T = 2000$
($q = 0.25$). One cluster made of $N_1 = \tilde{N} = 100$ variables, correlated via a coefficient $\gamma_1 = 0.7$, is present. A common mode
is introduced via a coefficient $\gamma_N = 0.3$. As expected from equation (25), two eigenvalue bulks are clearly visible. The mean
value in the bulk on the left is 0.04, while equation (25) would predict zero as a consequence of the strong correlation limit
($\gamma_k \to 1$) approximation. On the other hand, the mean value in the bulk on the right is 0.85, in remarkable agreement with the
predicted value $(1 - \gamma_N)^2 / ((1 - \gamma_N)^2 + \gamma_N^2) = 0.84$. For the sake of readability, the "large" eigenvalues are not shown. (b) By
setting $\gamma_1 = 0.4$, the two bulks in (a) merge into a single one. Such a structure, despite emerging as a consequence of (weak)
correlations, is very well fitted by a Marcenko-Pastur distribution (see equation (16)) with $q = 0.29$ and $\sigma = 0.88$.
Such features are further illustrated in Figure 3, which refers to the case of a factor model with no common mode
($\gamma_N = 0$). Again, the progressive fusion (induced by weaker correlations) between separated eigenvalue bulks is
shown. Also, we compare the numerically obtained spectra to the eigenvalue densities obtained from the solution of
equation (43), obtaining a very good agreement between the two (Figure 3 (a)-(b)). Just like in the previous case,
the Marcenko-Pastur distribution seems to provide quite a good fit of the "limiting" eigenvalue bulk obtained for
small correlations (Figure 3 (c)). However, we performed a Kolmogorov-Smirnov [32] test under the null hypothesis
of data distributed according to a Marcenko-Pastur distribution, and we found such a hypothesis to be rejected at all
the significance levels we considered (see the caption of Figure 3 for further details). On the other hand, the same
test did not allow us to reject the hypothesis of data distributed according to the eigenvalue density obtained from
the solution of equation (43), its degenerate eigenvalues being given by equation (29). This is quite surprising, given
the great similarity between the two densities (see Figure 3 (d)), which would be almost indistinguishable if plotted
on the scale of the whole distribution (as in Figure 3 (c)). Nevertheless, we believe this result to be quite relevant,
since it strongly suggests that the theoretical framework outlined in the previous subsection might be the right way
to describe and analyze empirical correlation matrices which display a cluster correlation structure, at least to some
extent. In the following section, we shall apply these ideas to financial data.

We also believe these findings to provide some interesting evidence against the use of the Marcenko-Pastur distribution
whenever non-negligible correlations are present between random variables. Despite being close, in a number of
situations, to the eigenvalue densities deriving from the solution of equation (43), the Marcenko-Pastur distribution
always needs to be fitted on the data under study, even when they are completely under control (as in the case of
Monte Carlo simulations). Then, as already pointed out, the presence of correlations causes the parameters $q$ and $\sigma$
to deviate from the corresponding values which would be obtained in a pure noise situation. In particular, given the
definition in equation (12), this leads to the introduction of the artificial, and possibly misleading, concept of effective
system size.

FIG. 3: (a) Eigenvalue spectrum of the correlation matrix for the factor model yielding the spectrum in (29) for $N = 500$, 
$T = 2000$, $\tilde{N} = 100$ and $\rho = 0.84$. This model yields two degenerate eigenvalues: $\lambda_1 = 1 - \rho = 0.16$ and $\lambda_2 = 1$ (see also 
Figure 2). The histogram is the result of 100 Monte Carlo simulations of such a model, while the solid line represents the density 
obtained from the solution of equation (43). (b) Eigenvalue spectrum for the same model with $\rho = 0.65$, i.e. for $\lambda_1 = 0.35$. 
It can be clearly seen that the two separated bulks shown in (a) start to merge as a consequence of the smaller correlations 
(smaller value of $\rho$). (c) Setting $\rho = 0.30$, the two eigenvalue bulks merge completely into one single structure. In analogy to 
Figure 2 (b), such a structure is apparently well fitted by a Marcenko-Pastur distribution with $q = 0.26$ and $\sigma = 0.97$, plotted as 
a solid line. On this scale, the Marcenko-Pastur distribution would be barely distinguishable from the density obtained from equation 
(43) with $\lambda_1 = 1 - \rho = 0.7$ and $\lambda_2 = 1$. (d) Comparison between the two densities around their peak, where 
they differ the most. Despite the quite small deviation between the two, a Kolmogorov-Smirnov (KS) test performed on the data 
gave the following results. The critical values, for different significance levels $\alpha$, are $CV_{KS}(\alpha = 0.10) = 5.5 \times 10^{-3}$, 
$CV_{KS}(\alpha = 0.05) = 6.1 \times 10^{-3}$ and $CV_{KS}(\alpha = 0.01) = 7.3 \times 10^{-3}$. Under the null hypothesis of data distributed according to 
the Marcenko-Pastur distribution, the value of the KS statistic was $STAT_{KS} = 7.9 \times 10^{-3}$, allowing for the rejection of the 
null hypothesis at all the significance levels considered. On the other hand, under the null hypothesis of data distributed 
according to the density obtained from equation (43), we obtained $STAT_{KS} = 2.3 \times 10^{-3}$, so that the null hypothesis 
cannot be rejected. Clearly, the large statistics in this example plays a relevant role in helping the KS test to "distinguish" the 
two densities. Smaller data samples would prevent the Marcenko-Pastur distribution from being rejected. 

Finally, to conclude this subsection on Monte Carlo simulations, in Figure 4 we show a numerically obtained 
distribution of the largest sample eigenvalue for a factor model yielding the eigenvalue spectrum in (29). As can be 
seen by direct inspection, the corresponding histogram is well fitted by a Gaussian distribution, as already anticipated 
in the previous subsection. Moreover, three statistical tests (whose details are provided in the caption) were performed 
under the null hypothesis of Normally distributed data, and none of the results we obtained allows us to reject such a 
hypothesis. 
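
A reduced version of this numerical experiment can be sketched in Python as follows. This is only an illustration under our own assumptions: the sizes ($N = 100$, a cluster of 40 assets, $T = 400$, 300 simulations) are chosen to run quickly and are not those of Figure 4, the function name `largest_eigenvalues` is hypothetical, and only the Jarque-Bera test is run.

```python
import numpy as np
from scipy import stats

def largest_eigenvalues(N, n_cluster, rho, T, n_sims, rng):
    """Largest eigenvalue of the sample correlation matrix over n_sims simulations
    of a one-cluster Gaussian factor model."""
    C = np.eye(N)
    C[:n_cluster, :n_cluster] = rho
    np.fill_diagonal(C, 1.0)
    L = np.linalg.cholesky(C)
    out = np.empty(n_sims)
    for s in range(n_sims):
        R = L @ rng.standard_normal((N, T))              # N assets, T observations
        out[s] = np.linalg.eigvalsh(np.corrcoef(R))[-1]  # eigvalsh sorts in ascending order
    return out

rng = np.random.default_rng(1)
N, n_cluster, rho, T = 100, 40, 0.85, 400
lam_max = largest_eigenvalues(N, n_cluster, rho, T, n_sims=300, rng=rng)

predicted = n_cluster * rho + (1.0 - rho)      # large eigenvalue of the model matrix
jb_stat, jb_pvalue = stats.jarque_bera(lam_max)
print("sample mean:", lam_max.mean(), "prediction:", predicted)
print("Jarque-Bera statistic:", jb_stat, "p-value:", jb_pvalue)
```

The sample mean is expected to lie close to the prediction, with a small finite-size offset of the kind visible in the caption of Figure 4.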

III. EMPIRICAL DATA: THE INVERSE PROBLEM 

The goal of this section is to show that some of the features displayed in correlation matrix spectra of factor models 
are actually present in empirical spectra of financial correlation matrices too. In particular, our goal is to show that 
the empirically observed eigenvalue bulks, such as the ones shown in Figure 1, cannot be regarded as a consequence of 
pure noise at all. Indeed, by suitably filtering empirical data, it is possible to show a peak separation similar to those 
shown in Figure 2 and Figure 3. 

FIG. 4: Distribution of the largest eigenvalue $\lambda_{max}$ from 5000 Monte Carlo simulations of the factor model yielding the 
spectrum in (29) with $\rho = 0.85$, $N = 500$ and $\tilde{N} = 100$. The distribution is well fitted by a Normal distribution with expected 
value $m = 84.79$, very close to the theoretical value predicted by equation (29): $\tilde{N} \rho + (1 - \rho) = 85.15$. Three different 
statistical tests (Jarque-Bera, Lilliefors and Kolmogorov-Smirnov) [32] were performed, assuming a null hypothesis of Normally 
distributed data. In the following we report the critical values (CV) obtained for the different tests and for different 
significance levels $\alpha$. Also, we report the statistic values (STAT), which, if smaller than the critical values, prevent the null 
hypothesis from being rejected. Jarque-Bera test: $STAT_{JB} = 3.114$, $CV_{JB}(\alpha = 0.10) = 4.605$, $CV_{JB}(\alpha = 0.05) = 5.992$, 
$CV_{JB}(\alpha = 0.01) = 9.210$. Lilliefors test: $STAT_{L} = 0.78 \times 10^{-2}$, $CV_{L}(\alpha = 0.10) = 1.14 \times 10^{-2}$, $CV_{L}(\alpha = 0.05) = 1.25 \times 10^{-2}$ 
and $CV_{L}(\alpha = 0.01) = 1.56 \times 10^{-2}$. Kolmogorov-Smirnov test: $STAT_{KS} = 0.78 \times 10^{-2}$, $CV_{KS}(\alpha = 0.10) = 1.73 \times 10^{-2}$, 
$CV_{KS}(\alpha = 0.05) = 1.92 \times 10^{-2}$, $CV_{KS}(\alpha = 0.01) = 2.30 \times 10^{-2}$. None of the previous results allows for the rejection of the 
hypothesis of Normally distributed data. 

Starting from the two data sets already introduced in a previous section (396 assets from the S&P500 Index and 243 
assets from the FTSE350 Index), we shall restrict our attention only to a relatively small number of properly chosen 
assets. This will be done in order to recreate, as closely as possible, the conditions under which the previously 
discussed eigenvalue bulks emerge from factor models. We shall attempt to empirically recreate the block-diagonal 
correlation matrix in (27), which, when a single cluster is considered, yields the eigenvalue spectrum of equation (29). 
Let us rewrite that matrix in this specific case: 

$$C = \begin{pmatrix} \Sigma^{(\tilde{N})} & 0 \\ 0 & I_{N - \tilde{N}} \end{pmatrix} , \qquad (44)$$

recalling that $\Sigma^{(\tilde{N})}_{ij} = \rho$ for $i \neq j$ and $\Sigma^{(\tilde{N})}_{ii} = 1$. 

In order to reproduce the structure in (44), we start from the empirical covariance matrices (let us denote them as $c$, 
according to the previously adopted notation) of our data sets and apply the following procedure. 

• We first identify a small cluster of $\tilde{N}$ strongly mutually correlated assets. If we denote the corresponding set of 
indices as $I_U$, then we have 

$$c_{ij} > \rho_U , \qquad i, j \in I_U \qquad (45)$$

for some threshold value $\rho_U > 0$. 

• Then, all assets which are weakly correlated to the elements in the cluster are identified. Amongst those, only 
the ones with small mutual correlations are retained. By grouping their indices in another set $I_D$, we can write 

$$\begin{aligned} |c_{ij}| &< \rho'_D , \qquad i \in I_U , \; j \in I_D \\ |c_{kl}| &< \rho''_D , \qquad k, l \in I_D , \; k \neq l \end{aligned} \qquad (46)$$

for some threshold values $\rho'_D, \rho''_D \in (0, \rho_U)$ such that $\rho'_D < \rho''_D$. 

The first condition in (46) is meant to reproduce the zero off-diagonal blocks in (44), while the second one is 
meant to reproduce the identity matrix in the lower-right block. 

• If we now redefine $N$ to be the total number of stocks in $I_U$ and $I_D$, so that $I_D$ contains $N - \tilde{N}$ elements, and 
we properly sort them, then the approximation to (44) is given as follows 

$$c = \begin{pmatrix} c_{I_U} & c_{I_U I_D} \\ c_{I_D I_U} & c_{I_D} \end{pmatrix} , \qquad (47)$$

where $c_{I_U}$ and $c_{I_D}$ are square matrices (of dimensions $\tilde{N}$ and $N - \tilde{N}$ respectively) containing the correlation 
matrix elements pertaining to the two sets $I_U$ and $I_D$. On the other hand, the $c_{I_U I_D}$ matrix ($c_{I_D I_U}$ being its 
transpose) contains the "interaction" terms between the two sets. 

FIG. 5: Graphical representation of correlation matrices. (a) $40 \times 40$ correlation matrix for the selected returns belonging to 
the S&P500 Index. (b) $28 \times 28$ correlation matrix for the selected returns belonging to the FTSE350 Index. In (a) and (b) 
the stocks have been sorted in order to highlight the cluster structure. (c)-(d) Model correlation matrices corresponding to the 
cases shown in (a) and (b), respectively. In all plots, white diagonal blocks correspond to ones, while black ones correspond 
to zeroes. Gray shadings are intermediate values. As already explained in the main text, the gray shadings in (c) and (d) 
correspond to $\rho = 0.712$ and $\rho = 0.707$, respectively. The presence of unresolved structures in (a) and, less evidently, in (b), 
suggests that the model matrix in (44), depicted in (c) and (d), does not fully capture all empirical features. 
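
The selection procedure described in the bullet points above can be sketched in Python as follows. The text does not specify how the sets $I_U$ and $I_D$ were searched for, so the greedy strategy used here is purely our assumption, and the function name `select_assets` and its argument names are hypothetical. The thresholds `rho_u`, `rho_d1` and `rho_d2` play the roles of $\rho_U$, $\rho'_D$ and $\rho''_D$ in conditions (45) and (46).

```python
import numpy as np

def select_assets(c, rho_u, rho_d1, rho_d2):
    """Greedy sketch of conditions (45)-(46) on a correlation matrix c.
    Returns the cluster indices, the weakly correlated indices and the sorted sub-matrix (47)."""
    n = c.shape[0]
    off = c - np.diag(np.diag(c))
    # Seed the cluster with the most correlated pair
    i, j = np.unravel_index(np.argmax(off), off.shape)
    if off[i, j] <= rho_u:
        raise ValueError("no pair of assets exceeds the threshold rho_u")
    cluster = [int(i), int(j)]
    # Condition (45): add assets while every correlation inside the cluster stays above rho_u
    while True:
        rest = [k for k in range(n) if k not in cluster]
        if not rest:
            break
        weakest_link = np.array([c[k, cluster].min() for k in rest])
        best = int(np.argmax(weakest_link))
        if weakest_link[best] <= rho_u:
            break
        cluster.append(rest[best])
    # First line of (46): assets weakly correlated with every cluster member
    candidates = [k for k in range(n)
                  if k not in cluster and np.abs(c[k, cluster]).max() < rho_d1]
    # Second line of (46): keep only candidates weakly correlated with those already kept
    weak = []
    for k in candidates:
        if all(abs(c[k, l]) < rho_d2 for l in weak):
            weak.append(k)
    order = cluster + weak
    return cluster, weak, c[np.ix_(order, order)]

# Toy matrix: assets 0-4 form a cluster, asset 5 is moderately tied to it,
# assets 6 and 7 are mutually correlated, the remaining ones are uncorrelated.
c_demo = np.eye(12)
c_demo[:5, :5] = 0.7
c_demo[5, :5] = c_demo[:5, 5] = 0.3
c_demo[6, 7] = c_demo[7, 6] = 0.5
np.fill_diagonal(c_demo, 1.0)

cluster, weak, c_sorted = select_assets(c_demo, rho_u=0.5, rho_d1=0.1, rho_d2=0.2)
print(cluster, weak, c_sorted.shape)
```

A greedy search depends on the order in which assets are visited and need not return the largest admissible sets, which is one reason why it should be regarded as a sketch only.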

The goal of such a construction is to empirically make contact with the spectrum in (29). As a matter of fact, for 
suitably chosen threshold values $\rho_U$, $\rho'_D$ and $\rho''_D$, we expect the eigenvalue spectrum of the $c$ matrix in (47) to be the 
noise-dressed version of the one in (29). In particular, small values of $\rho'_D$ and $\rho''_D$ should guarantee the $c_{I_D}$ block to 
yield $N - \tilde{N}$ eigenvalues close to one. On the other hand, the block $\Sigma^{(\tilde{N})}$ in (44) yields $\tilde{N} - 1$ small eigenvalues equal 
to $1 - \rho$ and a large one equal to $\tilde{N} \rho + (1 - \rho)$. Now, it is reasonable to assume $\rho$ to be equal to the average mutual 
correlation between the assets in $I_U$ 

$$\rho = \frac{2}{\tilde{N} (\tilde{N} - 1)} \sum_{i < j ; \; i, j \in I_U} c_{ij} , \qquad (48)$$

and to suppose that $c_{I_U}$ will produce $\tilde{N} - 1$ eigenvalues close to $1 - \rho$. 
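
The eigenvalue counting used above is easy to verify numerically on the noise-free model matrix. The short Python check below builds the block matrix with one equicorrelated cluster and an identity block, using the FTSE350 values quoted in the text (28 assets in total, 7 in the cluster, $\rho = 0.707$). The function name `model_matrix` is a hypothetical helper of ours.

```python
import numpy as np

def model_matrix(N, n_cluster, rho):
    """Noise-free model: one equicorrelated cluster of size n_cluster plus an identity block."""
    C = np.eye(N)
    C[:n_cluster, :n_cluster] = rho
    np.fill_diagonal(C, 1.0)
    return C

N, n_cluster, rho = 28, 7, 0.707
eig = np.linalg.eigvalsh(model_matrix(N, n_cluster, rho))   # ascending order

small = eig[:n_cluster - 1]     # n_cluster - 1 eigenvalues equal to 1 - rho
bulk = eig[n_cluster - 1:-1]    # N - n_cluster eigenvalues equal to one
largest = eig[-1]               # n_cluster * rho + (1 - rho)
print(small.round(3), bulk.round(3), round(float(largest), 3))
```

In the empirical matrices these three groups are only approximately degenerate, since both the sampling noise and the residual correlation structure spread them out.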

In the following we present and discuss the results we obtained applying this procedure to our datasets. In the case 
of the S&P500 Index, we identified a cluster made of $\tilde{N}_{S\&P} = 7$ strongly mutually correlated assets ($\rho_{S\&P} = 0.712$, 
with $\rho_{S\&P}$ computed as in (48)), all of which happen to belong to the energy sector. We then identified a group of 33 
stocks, belonging to various sectors, which satisfy the previously described requirements: a small mutual correlation 
(mean value = 0.099) and a small correlation with the $\tilde{N}$ elements in the cluster (mean value = 0.096). So, all in 
all we have $N_{S\&P} = 40$. Analogously, in the FTSE350 Index case we were also able to identify a cluster made of 
$\tilde{N}_{FTSE} = 7$ highly mutually correlated stocks ($\rho_{FTSE} = 0.707$), all corresponding to investment trusts. In this case, 
however, we only found 21 more stocks (so that $N_{FTSE} = 28$) satisfying the aforementioned requirements (mean value 
of mutual correlation = 0.015, mean value of correlations with elements in the cluster = 0.014). In Figure 5, graphical 
representations of the empirical correlation matrices we obtained, and a comparison to the theoretically expected 
ones, are shown. As can be seen by direct inspection, the correlation matrix we obtain in the FTSE350 Index case 
is remarkably similar to the one in (44), whereas the one we obtain for the S&P500 Index has some further inner 
structure as a consequence of the much higher mean correlations. 

In Figure 6 the eigenvalue spectra we obtained from the previously discussed correlation matrices (the ones reported 
in Figure 5 (a)-(b)) are shown. In particular, in Figure 6 (a)-(b) we plot the spectra obtained, respectively, from the 
S&P500 and FTSE350 correlation matrices constructed according to the clustering procedure outlined previously. In 
both cases, two distinct eigenvalue bulks can be noticed. The smaller bulks on the left are made of 6 eigenvalues, and, 
since we have $\tilde{N}_{S\&P} = \tilde{N}_{FTSE} = 7$, this is in agreement with the prediction given by equation (29) for $\tilde{N} = 7$. Also, 
in the FTSE350 Index case (Figure 6 (b)) the larger eigenvalue bulk around one is made of $N_{FTSE} - \tilde{N}_{FTSE} = 21$ 
eigenvalues, which is again in agreement with (29), while the largest eigenvalue in the spectrum (not shown in the 
plot) is equal to 5.235, remarkably close to the prediction given by $\tilde{N}_{FTSE} \, \rho_{FTSE} + (1 - \rho_{FTSE}) = 5.242$ (see again 
equation (29)). On the other hand, the spectrum relative to the S&P500 Index yields two large eigenvalues (not 
shown in Figure 6 (a)) equal to 3.552 and 6.483, and neither value is in agreement with the large eigenvalue prediction 
$\tilde{N}_{S\&P} \, \rho_{S\&P} + (1 - \rho_{S\&P}) = 5.272$. Such a discrepancy is due to the unresolved correlation structure in the empirical 
S&P matrix (see Figure 5 (a)), which gives rise to additional sub-clusters. 

In Figure 6 (c)-(d) the two eigenvalue bulks we just discussed are fitted with the eigenvalue density deriving from the 
second equation in (35) when applied to the solution of (43), i.e. the moment generating function $m_c$ of the 
noise-dressed version of a correlation matrix $C$ with degenerate eigenvalues. In both cases we consider correlation matrices 
with two degenerate eigenvalues in order to try to fit the two main bulks. The smaller eigenvalue $\lambda_1$, responsible for 
the emergence of the smaller bulks on the left, is assumed to be equal to $1 - \rho$, according to equation (29). So, in 
the two different cases we analyzed we have 

$$\begin{aligned} \lambda_{1, S\&P} &= 1 - \rho_{S\&P} = 0.288 , \qquad & w_{1, S\&P} &= \frac{\tilde{N}_{S\&P} - 1}{N_{S\&P} - 2} , \\ \lambda_{1, FTSE} &= 1 - \rho_{FTSE} = 0.293 , \qquad & w_{1, FTSE} &= \frac{\tilde{N}_{FTSE} - 1}{N_{FTSE} - 1} , \end{aligned} \qquad (49)$$



where the two slightly different weights are justified by the previously mentioned fact that in the S&P case there are 
two isolated eigenvalues which separate from the main bulks, while in the FTSE case there is only one such eigenvalue. 
On the other hand, the larger eigenvalue $\lambda_2$, which according to (29) should be exactly equal to one, is assumed to 
be equal to the empirical mean value of the main bulks on the right in Figure 6 (c)-(d). These are found to be 

$$\lambda_{2, S\&P} = 0.887 , \qquad \lambda_{2, FTSE} = 0.997 \qquad (50)$$

and one might notice that, again, the value obtained in the FTSE case is in excellent agreement with the theoretically 
expected one. So, all in all, the curves drawn in Figure 6 (c)-(d) are obtained from the values in equations (49) and 
(50). Such curves, as already mentioned, are fitted to the empirical spectra. However, a bootstrap approach was 
adopted in order to improve the statistics. More specifically, for each bootstrap iteration a random sampling of the 
weakly correlated stocks was performed, picking 30 out of 33 in the S&P case and 18 out of 21 in the FTSE case. On 
the contrary, the stocks forming the highly correlated clusters were always kept (thus keeping the eigenvalue bulks 
on the left almost unchanged). As can be seen in Figure 6 (c)-(d), the agreement between data and theory is 
very poor. This is essentially due to the additional correlation structures in the empirical correlation matrices (see 
Figure 5), which are neglected in the model matrix (44) and in its eigenvalue spectrum (29). All the bulks displayed 
in Figure 6 (c)-(d) appear to be "smeared" versions of their theoretical counterparts, even the small ones relative to 
the eigenvalues in (49). Interestingly, this shows that inhomogeneities in correlation structures have quite an impact 
on eigenvalue spectra even on a "small scale" (let us recall that $\tilde{N}_{S\&P} = \tilde{N}_{FTSE} = 7$). 

In Figure 6 (e)-(f) the same fit as the one just discussed is performed, the only difference being that an additional 
random reshuffling of the returns is performed on the bootstrapped assets. Such an operation is meant to destroy all 
possible correlations, and this leads to a quite good agreement between data and predictions on the bulks on the right 
(the theoretical densities being now computed with $\lambda_2 = 1$, according to equation (29)). This essentially confirms 
that the substantial deviations shown in Figure 6 (c)-(d) can entirely be imputed to the unresolved cluster structures 
in the empirical correlation matrices. The same kind of analyses (bootstrap and reshuffling) were not performed on 
the stocks belonging to the correlated clusters because of their very small number. 
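
The bootstrap and reshuffling steps described above can be sketched in Python as follows. This is an illustration on synthetic Gaussian returns rather than the analysis performed on the S&P500 and FTSE350 data. The function names (`shuffle_in_time`, `bootstrap_spectrum`) and the structure of the synthetic data are our own assumptions, and we assume that reshuffling means an independent random permutation in time of each selected return series.

```python
import numpy as np

def shuffle_in_time(x, rng):
    """Permute every column (asset) of a T x n array independently along the time axis."""
    return np.column_stack([rng.permutation(col) for col in x.T])

def bootstrap_spectrum(cluster, weak, n_pick, n_iter, rng, reshuffle=False):
    """Pool the eigenvalues obtained by always keeping the cluster assets and drawing
    n_pick of the weakly correlated assets (without replacement) at each iteration."""
    pooled = []
    for _ in range(n_iter):
        cols = rng.choice(weak.shape[1], size=n_pick, replace=False)
        sub = weak[:, cols]
        if reshuffle:
            sub = shuffle_in_time(sub, rng)   # destroys cross-correlations, keeps marginals
        data = np.hstack([cluster, sub])
        pooled.append(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))
    return np.concatenate(pooled)

# Synthetic stand-in for the data: 7 strongly correlated assets and 33 "weak" ones,
# ten of which hide a residual common factor (an unresolved structure).
rng = np.random.default_rng(2)
T = 1000
cluster = np.sqrt(0.7) * rng.standard_normal((T, 1)) + np.sqrt(0.3) * rng.standard_normal((T, 7))
weak = rng.standard_normal((T, 33))
weak[:, :10] = np.sqrt(0.3) * rng.standard_normal((T, 1)) + np.sqrt(0.7) * weak[:, :10]

plain = bootstrap_spectrum(cluster, weak, n_pick=30, n_iter=100, rng=rng)
shuffled = bootstrap_spectrum(cluster, weak, n_pick=30, n_iter=100, rng=rng, reshuffle=True)
print(plain.size, shuffled.size)
```

Comparing histograms of `plain` and `shuffled` around one shows how a residual structure among the weakly correlated assets smears the right-hand bulk, and how reshuffling removes that effect.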
FIG. 6: (a)-(b) Eigenvalue densities of the correlation matrices represented in Figure 5 (a)-(b), relative to the S&P500 and 
FTSE350 Indices, respectively. In both cases, one can clearly distinguish two well separated bulks, while the largest eigenvalues 
have not been plotted for better visualization (see main text for further explanation). (c)-(d) Comparison between the 
theoretically expected spectra derived via equation (43) and the empirical ones. The latter have been modified with respect to (a) 
and (b) according to the following approach. A bootstrap random sampling (100 iterations) has been performed on the weakly 
correlated subsets of stocks, picking 30 stocks out of 33 in the S&P500 Index case and 18 out of 21 in the FTSE350 Index case. 
The presence of undetected structures (see Figure 5 (a)-(b) and main text) leads to a poor agreement between data and theory. 
The values and weights of the eigenvalues used to plot the theoretical density obtained from equation (38) are detailed in (49). 
(e)-(f) As in (c)-(d) but with weakly correlated data reshuffled before bootstrap, leading to a much better agreement between 
data and theory. The reference eigenvalue for the bulks on the right is now assumed to be equal to one (see main text). 



All in all, the previous observations strongly suggest that the empirically observed eigenvalue bulks cannot be 
regarded as a consequence of the noisiness in financial correlation matrices. On the contrary, in the light of the 
previous discussions it could be conjectured that bulks emerge from the interplay of several cluster structures like the 
ones we isolated (see Figures 2 and 3). 
IV. SUMMARY AND CONCLUSIONS 

Let us now summarize the main messages in the paper. 

• Several rough but useful results about spectral properties of financial correlation matrices, such as the position 
of large non-degenerate eigenvalues, can be inferred by a clever application of the direct problem (see Section II 
A). This only involves algebraic calculations, namely the solution of suitable secular equations. This approach 
can be used either when the cluster structure is known a priori, or when there are good reasons to assume a 
certain correlation structure. Combining the direct analysis with Monte Carlo simulations can provide a clear 
picture in a number of situations, avoiding the analytical difficulties of Random Matrix Theory, and keeping the 
finite-sized nature of the problem. Typically, one wishes to reproduce observed spectra starting from a factor 
model, and this can be done as follows. 

1. Identify the cluster structure in the dataset under analysis, using clustering algorithms [33]. 

2. Estimate the average correlations within clusters. 

3. Build a theoretical, "mean field" matrix model C from the above estimates. 

4. Run Monte Carlo simulations of the matrix model. 

5. Compare the outcome of the simulation to the empirical spectrum. 

If the comparison is statistically satisfactory, the matrix model C can be retained and used for further analyses, 
such as portfolio selection. If not, the model is to be refined by abandoning the mean field assumption, at least 
for some cluster interactions. 

• As far as the largest eigenvalue is concerned, its distribution is not Tracy-Widom, but Normal [30, 31]. Moreover, 
this distribution cannot be derived from the thermodynamic limit formula (43). In fact, such an eigenvalue is 
typically non-degenerate and its weight in a diagrammatic expansion of the Green's function would vanish as 
$1/N$ for $N \to \infty$. 

• For factor models, the bulks in empirical eigenvalue spectra come as the noise-dressed version of degenerate 
eigenvalues. Thus, such bulks encode the information on the cluster structure of the empirical correlation 
matrix c, and this can be evidenced by means of proper clustering methods, as done in Section III. 
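
The five-step recipe given in the first bullet point above can be sketched in Python as follows. This is a minimal illustration under several assumptions of ours, none of which comes from the text: average-linkage hierarchical clustering on the distance $\sqrt{2(1 - c_{ij})}$ as the clustering algorithm, inter-cluster correlations set to zero in the mean field matrix, Gaussian returns, a two-sample Kolmogorov-Smirnov statistic as the comparison criterion, and synthetic data in place of an empirical data set. All function names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import ks_2samp

def find_clusters(c, max_dist):
    """Step 1: hierarchical clustering on the correlation distance sqrt(2 (1 - c_ij))."""
    d = np.sqrt(2.0 * np.clip(1.0 - c, 0.0, None))
    np.fill_diagonal(d, 0.0)
    Z = linkage(squareform(d, checks=False), method="average")
    return fcluster(Z, t=max_dist, criterion="distance")

def mean_field_matrix(c, labels):
    """Steps 2-3: replace each cluster block by its average off-diagonal correlation
    (clipped to [0, 0.999] so that the model stays positive definite)."""
    C = np.eye(c.shape[0])
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)
        if idx.size > 1:
            block = c[np.ix_(idx, idx)]
            rho = (block.sum() - idx.size) / (idx.size * (idx.size - 1))
            C[np.ix_(idx, idx)] = np.clip(rho, 0.0, 0.999)
    np.fill_diagonal(C, 1.0)
    return C

def simulate_spectra(C, T, n_sims, rng):
    """Step 4: pooled sample-correlation eigenvalues from Gaussian Monte Carlo runs."""
    L = np.linalg.cholesky(C)
    return np.concatenate([
        np.linalg.eigvalsh(np.corrcoef(L @ rng.standard_normal((C.shape[0], T))))
        for _ in range(n_sims)])

# Synthetic "empirical" data: two clusters of 10 assets (rho = 0.6) and 10 uncorrelated assets
rng = np.random.default_rng(3)
N, T = 30, 1000
C_true = np.eye(N)
C_true[:10, :10] = 0.6
C_true[10:20, 10:20] = 0.6
np.fill_diagonal(C_true, 1.0)
c_emp = np.corrcoef(np.linalg.cholesky(C_true) @ rng.standard_normal((N, T)))

labels = find_clusters(c_emp, max_dist=np.sqrt(2.0 * (1.0 - 0.3)))   # step 1
C_model = mean_field_matrix(c_emp, labels)                           # steps 2-3
simulated = simulate_spectra(C_model, T, n_sims=50, rng=rng)         # step 4
ks = ks_2samp(np.linalg.eigvalsh(c_emp), simulated)                  # step 5
print("cluster sizes:", np.bincount(labels)[1:], "KS statistic:", ks.statistic)
```

On real data the comparison in step 5 would decide whether the mean field matrix is retained or has to be refined, as discussed in the first bullet point.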

While there would be no difficulty in studying non-Gaussian multivariate models by means of Monte Carlo simulations, 
the analytical results presented in Section II B and in the Appendix cannot be easily generalized. In fact, the integrals 
needed to calculate $g_c$ in (34) in the Gaussian case can be exactly obtained by virtue of Wick's theorem, whereas 
different stochastic models would require painful calculations. 

The diagrammatic method outlined in the Appendix allows, in principle, for the exact evaluation of the Green's 
function $g_c$ for any finite size $N \times T$, as a function of $N$ and $T$. Nevertheless, this is a series in powers of $1/z$, whose 
convergence properties would be interesting to investigate in the near future. 
Appendix A: Diagrammatic method 

For the sake of completeness, in this Appendix we replicate the derivation, already detailed in [12], of equations 
(39) and (40). The starting point is to expand the second Green's function in (34) (let us now set $\mathbf{z} = z I_N$, with 
$z \in \mathbb{C}$): 

$$\begin{aligned} g_c(z) &= E\left[ (\mathbf{z} - c)^{-1} \right] \\ &= E\left[ \mathbf{z}^{-1} + \mathbf{z}^{-1} c \, \mathbf{z}^{-1} + \mathbf{z}^{-1} c \, \mathbf{z}^{-1} c \, \mathbf{z}^{-1} + \dots \right] \\ &= E\left[ \mathbf{z}^{-1} + \mathbf{z}^{-1} R \frac{I_T}{T} R^T \mathbf{z}^{-1} + \mathbf{z}^{-1} R \frac{I_T}{T} R^T \mathbf{z}^{-1} R \frac{I_T}{T} R^T \mathbf{z}^{-1} + \dots \right] , \end{aligned} \qquad (A1)$$

where in the last line the identity $T \times T$ matrix has been systematically inserted between $R$ and $R^T$ matrices, whenever 
they appear. The $\mathbf{z}^{-1}$ matrices in equation (A1), being multiples of the identity matrix, can be safely pulled out of 
the expectation (which is to be understood with respect to the probability measure in (10)). So, in order to compute 
a generic matrix element of the Green's function $g_c$, one needs to compute n-point correlation functions of the 
following kind: $E[R_{i_1 t_1} R_{i_2 t_2} \cdots R_{i_{2n} t_{2n}}]$. Following [12], two different kinds of indices have been considered in this 
expression: indices of the N-type (ranging from 1 to $N$) and indices of the T-type (ranging from 1 to $T$). Moreover, 
an even number of matrix elements has been considered, since the Gaussian probability measure (10) ensures the 
vanishing of all odd moments. Also, Wick's Theorem allows us to split any n-point correlation function into the sum 
of all possible products of two-point correlation functions (or propagators). For example, the four-point correlation 
function would read 

$$\begin{aligned} E[R_{i_1 t_1} R_{i_2 t_2} R_{i_3 t_3} R_{i_4 t_4}] = \; & E[R_{i_1 t_1} R_{i_2 t_2}] \, E[R_{i_3 t_3} R_{i_4 t_4}] + E[R_{i_1 t_1} R_{i_3 t_3}] \, E[R_{i_2 t_2} R_{i_4 t_4}] \\ & + E[R_{i_1 t_1} R_{i_4 t_4}] \, E[R_{i_2 t_2} R_{i_3 t_3}] , \end{aligned} \qquad (A2)$$

where the two-point correlation function is as in equation (7): 

$$E[R_{it} R_{jt'}] = C_{ij} \, \delta_{tt'} . \qquad (A3)$$

FIG. 7: Representations of the $\mathbf{z}^{-1}$ matrix, the $I_T/T$ matrix and the propagator. 
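
The Wick decomposition in equation (A2) can be checked numerically. The Python sketch below draws Gaussian returns whose two-point function is of the form (A3) and compares a Monte Carlo estimate of the four-point function at equal times (so that all Kronecker deltas equal one) with the sum of the three pairings. The correlation matrix used here is an arbitrary example of ours.

```python
import numpy as np

rng = np.random.default_rng(4)

# Arbitrary 4 x 4 correlation matrix (strictly diagonally dominant, hence positive definite)
C = np.array([[1.0, 0.4, 0.3, 0.2],
              [0.4, 1.0, 0.3, 0.1],
              [0.3, 0.3, 1.0, 0.3],
              [0.2, 0.1, 0.3, 1.0]])

# Each column of R is one time slice t, with E[R_it R_jt] = C_ij
R = np.linalg.cholesky(C) @ rng.standard_normal((4, 1_000_000))

# Left-hand side of (A2) for i1, i2, i3, i4 = 0, 1, 2, 3 and t1 = t2 = t3 = t4
empirical = np.mean(R[0] * R[1] * R[2] * R[3])

# Right-hand side of (A2): sum over the three ways of pairing the four indices
wick = C[0, 1] * C[2, 3] + C[0, 2] * C[1, 3] + C[0, 3] * C[1, 2]
print(empirical, wick)
```

The two numbers agree up to the Monte Carlo error, which decreases as the inverse square root of the number of time slices.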

Let us now represent the main ingredients in the Green's function expansion as follows (see Figure 7). N-type indices 
will be represented as black circles, while T-type indices will be represented as grey circles. A $\mathbf{z}^{-1}$ matrix element 
will be represented as a straight solid line connecting two N-type indices, while elements of the $I_T/T$ matrix will 
be represented as a dashed line connecting two T-type indices. The propagator in equation (A3), in turn, will be 
depicted as a double arc connecting two pairs of indices, each made of an N-type and a T-type index. With the 
previous conventions, the first few terms in the expansion of the Green's function look as in Figure 8. The Green's 
function is represented as a grey circle between N-type indices, while the other diagrams correspond to the different 
contributions in the expansion in (A1). As can be seen, such diagrams are divided into two categories: those with 
crossing lines (such as the one in the fourth line of Figure 8) and those without crossing lines, also known as planar 
diagrams. It can be shown that the contribution of diagrams belonging to the former group vanishes in the infinite 
matrix limit (13). Intuitively speaking, this is because in planar diagrams closed loops (which give a contribution of 
order $N$ for black lines and a contribution of order $T$ for grey lines) and external horizontal lines (giving contributions 
of order $1/N$ for black lines and of order $1/T$ for grey lines) occur in equal numbers. So, in the thermodynamic limit 
the two contributions balance each other. On the other hand, non-planar diagrams have extra $1/N$ factors which are 
not compensated by closed loops. So, in the thermodynamic limit, the Green's function is only composed of planar 
diagrams, whose building blocks are horizontal lines and "rainbow-like" structures, as is easily seen from Figure 8. 

FIG. 8: Diagrammatic expansion of the Green's function (represented as a circle between two N-type matrix indices). The 
diagrams on the first two lines represent the contributions of the $\mathbf{z}^{-1}$ matrix and the two-point correlation function (see Figure 
7) respectively. The other diagrams on the following lines represent the four-point correlation function, and in particular they 
represent the different contributions which arise by applying Wick's Theorem as in equation (A2). The diagram on the fourth 
line is non-planar, and its contribution to the expansion is negligible when the thermodynamic limit (13) is taken (see the main 
text for further clarifications). Higher order diagrams are not represented. 

More formally, such rainbow diagrams are usually called one-line-irreducible (1LI) since they cannot be split into two 
parts by cutting one horizontal line (either solid or dashed). It is then convenient to introduce the self-energy $\Sigma_c$, 
i.e. the generating function for such diagrams (see Figure 9). Then, the Green's function can be expanded in terms 
of the self-energy, as in Figure 10. Translating diagrams into equations, the meaning of Figure 10 is the following: 



$$g_c(z) = \mathbf{z}^{-1} + \mathbf{z}^{-1} \Sigma_c(z) \, \mathbf{z}^{-1} + \mathbf{z}^{-1} \Sigma_c(z) \, \mathbf{z}^{-1} \Sigma_c(z) \, \mathbf{z}^{-1} + \dots = \left( \mathbf{z} - \Sigma_c(z) \right)^{-1} . \qquad (A4)$$

Thus, the self-energy somehow represents an "effective" matrix that replaces the c matrix by removing the expectation 
map in the Green's function expansion of equation (A1). 

Parallel to the previous definitions, it is convenient to introduce a second Green's function for the matrix $R^T R$, which 
has exactly the same eigenvalues as the matrix $R R^T$ entering $c$ (plus some possible additional zero modes, depending on 
the relative size of $N$ and $T$). Let us define such a Green's function as 

$$g_{*c}(z) = E\left[ \left( T I_T - R^T \mathbf{z}^{-1} R \right)^{-1} \right] , \qquad (A5)$$

FIG. 9: Representation of the self-energy as the generating function of one-line-irreducible (rainbow) diagrams. 

FIG. 10: Expansion of the Green's function in terms of the self-energy. 



in such a way that $g_{*c}$ would be represented by a diagrammatic expansion in which N-type and T-type indices (and 
consequently also dashed and solid lines) would switch roles with respect to the case of $g_c$, represented in Figure 8. 
Also, by defining a new self-energy function $\Sigma_{*c}$ as the generating function for 1LI diagrams with switched indices and 
lines (with respect to the 1LI diagrams generated by $\Sigma_c$), it is possible to write the following relation 

$$g_{*c}(z) = \left( T I_T - \Sigma_{*c}(z) \right)^{-1} , \qquad (A6)$$

in complete analogy with equation (A4). 

As already stated, in the thermodynamic limit Green's functions are only composed of planar diagrams and horizontal 
lines. On the other hand, all 1LI diagrams can be obtained by adding a propagator to some proper planar diagram 
(see Figures 8 and 9). This observation allows one to establish two relations between the Green's functions and the 
self-energy functions (which contain all possible 1LI diagrams). Recalling the form of propagators (see equation (A3)), 
such relations can be written as 

$$\begin{aligned} \Sigma_c(z) &= C \, \mathrm{Tr}[g_{*c}(z)] \\ \Sigma_{*c}(z) &= I_T \, \mathrm{Tr}[C g_c(z)] . \end{aligned} \qquad (A7)$$

Equations (A4), (A6) and (A7) form the so-called set of Dyson-Schwinger equations, which can be solved for $g_c$ by 
consecutively eliminating $\Sigma_{*c}$, $g_{*c}$ and $\Sigma_c$. This yields 

$$z \, g_c(z) = Z \, G_C(Z) , \qquad Z = \frac{z}{\mathrm{Tr}[g_{*c}(z)]} , \qquad (A8)$$

where $G_C$ is as in equation (34). By carrying out all calculations explicitly one finds 

$$Z = \frac{z}{1 + q \left( N^{-1} \mathrm{Tr}[z \, g_c(z)] - 1 \right)} . \qquad (A9)$$

Recalling the definition of the moment generating functions (see equations (35) and (36)), one can see that the 
relations in (A8) and (A9) complete the derivation of equations (39) and (40), which was the goal of this Appendix. 
Finally, let us mention (as already pointed out in [12]) that in the limit of very large samples, i.e. when $T \to \infty$ 
with $N$ fixed, one has $q = 0$ and consequently equations (A8) and (A9) yield $g_c(z) = G_C(z)$. This, of course, 
causes the corresponding spectral densities to be identical, giving a rigorous meaning to the intuitive statement that 
in the limit of a large number of observations the eigenvalue spectrum of the C matrix is faithfully reproduced by its 
estimator c. 
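
As a consistency check of the relations (A8) and (A9), one can consider the pure noise case $C = I_N$. This specialization is our own and is not taken from the text: with $G_C(Z) = (Z - 1)^{-1} I_N$, the two relations reduce to the quadratic equation $q z g^2 - (z + q - 1) g + 1 = 0$ for the normalized trace $g(z) = N^{-1} \mathrm{Tr}[g_c(z)]$, which is the equation satisfied by the resolvent of the Marcenko-Pastur law. The Python sketch below compares its solution with a Monte Carlo estimate at one complex value of $z$. The function name `g_pure_noise` and the chosen sizes are hypothetical.

```python
import numpy as np

def g_pure_noise(z, q):
    """Root of q z g^2 - (z + q - 1) g + 1 = 0 with negative imaginary part,
    i.e. the physical branch of the resolvent for Im z > 0."""
    roots = np.roots([q * z, -(z + q - 1.0), 1.0])
    return roots[np.argmin(roots.imag)]

rng = np.random.default_rng(5)
N, T, n_sims = 200, 800, 20
z = 1.0 + 0.5j

# Monte Carlo estimate of N^{-1} Tr E[(z - c)^{-1}] with c = R R^T / T and C = I_N
g_mc = 0.0
for _ in range(n_sims):
    R = rng.standard_normal((N, T))
    lam = np.linalg.eigvalsh(R @ R.T / T)
    g_mc += np.mean(1.0 / (z - lam)) / n_sims

g_th = g_pure_noise(z, N / T)
print("Monte Carlo:", g_mc, "theory:", g_th)
```

The residual difference between the two values is a finite-size effect, and for $q \to 0$ the selected root tends to $1/(z - 1)$, in agreement with the large sample limit mentioned above.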